Effective Use of Function Words for Rule Generalization in Forest-Based Translation

نویسندگان

  • Xianchao Wu
  • Takuya Matsuzaki
  • Jun'ichi Tsujii
چکیده

In the present paper, we propose the effective usage of function words to generate generalized translation rules for forest-based translation. Given aligned forest-string pairs, we extract composed tree-to-string translation rules that account for multiple interpretations of both aligned and unaligned target function words. In order to constrain the exhaustive attachments of function words, we limit to bind them to the nearby syntactic chunks yielded by a target dependency parser. Therefore, the proposed approach can not only capture source-tree-to-target-chunk correspondences but can also use forest structures that compactly encode an exponential number of parse trees to properly generate target function words during decoding. Extensive experiments involving large-scale English-toJapanese translation revealed a significant improvement of 1.8 points in BLEU score, as compared with a strong forest-to-string baseline system.

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

A new model for persian multi-part words edition based on statistical machine translation

Multi-part words in English language are hyphenated and hyphen is used to separate different parts. Persian language consists of multi-part words as well. Based on Persian morphology, half-space character is needed to separate parts of multi-part words where in many cases people incorrectly use space character instead of half-space character. This common incorrectly use of space leads to some s...

متن کامل

A QUADRATIC MARGIN-BASED MODEL FOR WEIGHTING FUZZY CLASSIFICATION RULES INSPIRED BY SUPPORT VECTOR MACHINES

Recently, tuning the weights of the rules in Fuzzy Rule-Base Classification Systems is researched in order to improve the accuracy of classification. In this paper, a margin-based optimization model, inspired by Support Vector Machine classifiers, is proposed to compute these fuzzy rule weights. This approach not only  considers both accuracy and generalization criteria in a single objective fu...

متن کامل

EFL Translation Students' Perspective toward Using Bilingual Dictionary in Translation of Polysemous Words

This research presented the use of bilingual dictionary and addressed the EFL translation students' points of view on the use of bilingual dictionary in translating polysemous words (English to Persian). Moreo- ver, it aimed at finding the possible relationship between the effect of using bilingual dictionary by stu- dents in translating polysemous words and their achieved scores. In the study ...

متن کامل

Multi-Word Unit Dependency Forest-based Translation Rule Extraction

Translation requires non-isomorphic transformation from the source to the target. However, non-isomorphism can be reduced by learning multi-word units (MWUs). We present a novel way of representating sentence structure based on MWUs, which are not necessarily continuous word sequences. Our proposed method builds a simpler structure of MWUs than words using words as vertices of a dependency stru...

متن کامل

Adaptive Rule-Base Influence Function Mechanism for Cultural Algorithm

This study proposes a modified version of cultural algorithms (CAs) which benefits from rule-based system for influence function. This rule-based system selects and applies the suitable knowledge source according to the distribution of the solutions. This is important to use appropriate influence function to apply to a specific individual, regarding to its role in the search process. This rule ...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2011